On Theoretically Valid Score Distributions in Information Retrieval

نویسندگان

  • Ronan Cummins
  • Colm O'Riordan
چکیده

In this paper, we aim to investigate the practical usefulness of the Recall-Fallout Convexity Hypothesis (RFCH) for a number of document score distribution (SD) models. We compare SD models that do not automatically adhere to the RFCH to modified versions of the same SD models that do adhere to the RFCH. We compare these models using the inference of average precision as a measure of utility. For the three models studied in this paper, we conclude that adhering to the RFCH is practically useful for the two-normal model, makes no difference for the two-gamma model, and degrades the performance of the twolognormal model.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Score Distributions in Information Retrieval

We review the history of modeling score distributions, focusing on the mixture of normal-exponential by investigating the theoretical as well as the empirical evidence supporting its use. We discuss previously suggested conditions which valid binary mixture models should satisfy, such as the Recall-Fallout Convexity Hypothesis, and formulate two new hypotheses considering the component distribu...

متن کامل

On Score Distributions and Relevance

We discuss the idea of modelling the statistical distributions of scores of documents, classified as relevant or non-relevant. Various specific combinations of standard statistical distributions have been used for this purpose. Some theoretical considerations indicate problems with some of the choices of pairs of distributions. Specifically, we revisit a generalisation of the well-known inverse...

متن کامل

Using Models of Score Distributions in Information Retrieval

Empirical modeling of a number of different text search engines shows that the score distributions on a per query basis may be fitted approximately using an exponential distribution for the set of nonrelevant documents and a normal distribution for the set of relevant documents. This model fits not only probabilistic search engines like INQUERY but also vector space search engines like SMART an...

متن کامل

Matching Scores of System Relevance and User-Oriented Relevance in SID, ISC and Google Scholar

Background and Aim: The main aim of Information storage and retrieval systems is keeping and retrieving the related information means providing the related documents with users’ needs or requests. This study aimed to answer this question that how much are the system relevance and User- Oriented relevance are matched in SID, SCI and Google Scholar databases. Method: In this study 15 keywords of ...

متن کامل

Score Following and Retrieval Based on Chroma and Octave Representation

With the studies of effective representation of music signals and music scores, i.e. chroma and octave features, this work conducts score following and score retrieval. To complement the shortage of chromagram representation, energy distributions in different octaves are used to describe tone height information. By transforming music signals and scores into sequences of feature vectors, score f...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012